Chen Wang

Consistency evaluation of benchmarks used for causal discovery

Jun 01, 2026

Yuzhe Zhang, Chihui Chen, Lina Yao, Chen Wang

Abstract:In graphical causal models, causal discovery aims to construct a causal graph from numerical data and plain-text domain knowledge. However, evaluating causal discovery methods remains a challenge because progress in domain research often leaves benchmark causal graphs containing misaligned knowledge. This problem especially affects the evaluation of large language model (LLM) based causal discovery methods, as they are sensitive to new discoveries in the literature. This work is the first to systematically study the quality of benchmark causal graphs. Specifically, we design a pipeline that automatically retrieves relevant research papers from scientific databases and prompts LLMs to check the consistency between the benchmark causal graphs and domain research papers. We evaluate 11 popular real-world benchmarks, for which our pipeline processes 38,081 domain papers in total. Our results show that popular benchmarks vary significantly in their consistency with domain research, with clear implications for causal discovery research.

Attend to Evidence: Evidence-Anchored Spatial Attention Supervision for Multimodal RLVR

May 29, 2026

Ruina Hu, Chen Wang, Lai Wei, Jionghao Bai, Bin Yu, Weiran Huang, Kai Wang, Yue Wang

Abstract:Reinforcement learning with verifiable rewards (RLVR) improves vision-language models (VLMs) by optimizing outcome rewards derived from final answers. However, such outcome-only rewards do not tell the model which image regions justify an answer. For questions that require visual grounding, these rewards cannot distinguish responses supported by relevant visual evidence from those produced by language-prior shortcuts or lucky guesses. We introduce EASE (Evidence-Anchored Spatial Attention), which augments multimodal RLVR with visual-evidence process supervision. EASE converts annotated evidence regions into a smoothed visual-token target and uses it to guide response-to-image attention during RL training, but only on high-reward trajectories. The annotations are used solely as privileged training labels, while inference requires only the original image and question. Across Qwen2.5-VL-7B, Qwen3-VL-4B, and Qwen3-VL-8B, EASE raises average scores over DAPO by 2.5 to 3.1 points on perception, hallucination, visual math, and multimodal reasoning benchmarks. Diagnostics and ablations show that EASE better aligns visual attention with annotated evidence regions.

HTAM: Hierarchical Transition-Attended Memory for Operator Optimization

May 28, 2026

Yining Zhang, Mingyang Yi, Chen Wang, Xuwen Xiang, Tianhe Jia, Zedong Dan, Chengqing Zong, Yue Wang

Abstract:High-performance GPU kernels are essential for efficient LLM deployment, yet optimizing them remains expertise-intensive. Recent LLM-based code generation makes automatic GPU operator generation promising, but operator optimization remains a hardware-aware search problem. Existing LLM-based methods face a granularity mismatch: coarse hints are reusable but hard to execute, whereas detailed memories are actionable but enlarge the search space and obscure optimization bottlenecks. The key challenge is therefore to organize optimization experience at an appropriate granularity. To address this issue, this paper proposes HTAM (Hierarchical Transition-Attended Memory), a coarse-to-fine framework for LLM-based operator optimization. HTAM builds a two-level Hierarchical Transition Graph (HTG) to organize coarse global directions, detailed local strategies, and transition experience between optimization steps. During each evolution step, HTAM selects a global direction from the current state and recent optimization history, retrieves the corresponding local strategy memory, and uses it to guide concrete CUDA code generation. Experiments on the full KernelBench suite demonstrate that HTAM consistently improves correctness, fast-solution rate, and speedup over LLM-based baselines, while backend and Robust-KBench studies indicate transferable benefits from structured memory.

* 24 pages, 5 figures

Unified Analytical Framework for SPAD Array Receivers with Dead-Time-Induced Blocking Loss and Inter-Symbol Interference in PAM-OWC Systems

May 27, 2026

Chen Wang, Zhiyong Xu, Jingyuan Wang, Jianhua Li, Weifeng Mou, Huatao Zhu

Abstract:Optical wireless communication (OWC) leveraging single-photon avalanche diode (SPAD) arrays offers exceptional sensitivity for photon-starved links. However, the inherent dead time of SPADs critically limits achievable data rates by introducing non-linear photon-counting distortions: blocking loss within a symbol duration and inter-symbol interference (ISI) across symbol durations. This paper proposes a unified analytical framework capturing both distortions across all operational speed regimes for pulse-amplitude modulation (PAM), by establishing comprehensive statistical models for SPAD array receivers. For low and medium-speed systems (symbol duration longer than dead time), we derive exact closed-form expressions for the photon-count probability distribution using renewal theory, explicitly incorporating blocking loss and ISI. For high-speed systems (symbol duration shorter than dead time), we develop a Markov chain model characterizing the steady-state operational states and integrate it with trigger probability to obtain the exact binomial photon-count distribution. Furthermore, we propose low-complexity, near-optimal threshold detection schemes based on these models. This work provides essential theoretical tools for designing and optimizing high-performance SPAD-based OWC systems employing PAM.

Automatic Attenuation Control for Mitigating Photon-Counting Saturation in SPAD-based Optical Wireless Communications

May 27, 2026

Chen Wang, Zhiyong Xu, Jingyuan Wang, Jianhua Li, Weifeng Mou, Huatao Zhu

Abstract:Single-photon avalanche diodes (SPADs) have emerged as a promising candidate for optical wireless communication (OWC) owing to their ultra-high sensitivity and single-photon detection capability. However, under strong background radiation or high signal power, SPAD-based receivers suffer from photon-counting saturation, which severely degrades communication performance. To address this challenge, this paper introduces an automatic attenuation control (AAC) technique that dynamically optimizes the incident optical intensity to mitigate saturation effects. We develop a comprehensive analytical model for the SPAD-based OWC system, incorporating the influence of dead time and the lack of photon-number resolution. Based on this model, a convex optimization-based AAC algorithm is proposed to maximize the achievable rate in real time. Furthermore, a low-complexity AAC algorithm is devised using a closed-form trigger probability criterion, reducing computational complexity by two orders of magnitude. Numerical results demonstrate that the proposed AAC technique significantly improves both the achievable rate and symbol error rate across a wide range of background conditions, providing an efficient solution to enhance the dynamic range of photon-counting receivers.

G-DRAGON: Geospatial Reasoning and Dynamic Planning for Retrieval-Augmented Outdoor Navigation

May 25, 2026

Dongzhihan Wang, Yi Du, Jianan Sun, Yuan Xue, Yingchen Zhang, Bing Xiao, Chen Wang, Liang Xu

Abstract:Autonomous ground robots operating in large-scale outdoor environments require both robust long-range navigation and fine-grained "last-mile" exploration. Current advances in visual-language navigation (VLN) work well on short-range tasks but lack geospatial grounding for long-distance missions. Some OpenStreetMap (OSM)-based methods that rely on cloud-based Large Language Models (LLMs) are prone to factual hallucination and cannot conduct "last-mile" exploration based on human instruction. To address these challenges, we present G-DRAGON, a retrieval-augmented framework for outdoor, open-world navigation. This framework maps natural-language commands to versioned, local OSM entities via generative retrieval based on a lightweight LLM, yielding accurate coordinates for global route planning. A high-level planning module bridges global topological routes with the SLAM system, projecting geospatial waypoints into the robot's navigable frame. For the "last mile," the framework transitions to frontier-based exploration and open-set semantic voxel mapping to localize open-vocabulary targets. Experimental results in simulation demonstrate that our framework outperforms state-of-the-art baselines. Furthermore, we validate the system in unseen real-world urban environments on an Unmanned Ground Vehicle (UGV), successfully completing person-search missions with trajectories of up to 500m.

* Accepted by IEEE Robotics and Automation Letters (RA-L)

PhyWorld: Physics-Faithful World Model for Video Generation

May 19, 2026

Pu Zhao, Juyi Lin, Timothy Rupprecht, Arash Akbari, Chence Yang, Rahul Chowdhury, Elaheh Motamedi, Arman Akbari, Yumei He, Chen Wang (+3 more)

Abstract:World simulators can provide safe and scalable environments for training Physical AI systems before real-world deployment. Large video generation models are emerging as a promising basis for such simulators because they can generate diverse and realistic visual futures. However, using them as world simulators requires physically faithful video continuations, namely, generated videos that preserve the physical state implied by the conditioning input, and evolve in ways consistent with basic physical principles. We propose PhyWorld, a video generation world model designed to produce temporally coherent and physically faithful scene continuations through two-stage post-training. In the first stage, we improve video-to-video continuation with flow matching fine-tuning, encouraging stable visual attributes and coherent motion dynamics across frames. In the second stage, we align generated dynamics with physical principles using Direct Preference Optimization (DPO) over physics preference pairs, guiding the model toward outputs with higher physical plausibility. To evaluate PhyWorld, we use both standard video-quality benchmarks and a dedicated physical-faithfulness benchmark with per-law scoring. Experiments show that PhyWorld improves video consistency, achieving an average score of 0.769 on VBench compared with 0.756 or below for state-of-the-art baselines. PhyWorld also improves physical plausibility, reaching an average score of 3.09 on our physical-faithfulness benchmark compared with 2.99 for the strongest baseline. These results suggest that post-training large video generation models with continuation and physics-preference signals can make them more effective world simulators for Physical AI.

Augmented Set-membership Affine Projection Algorithm and Its Performance Analysis

May 18, 2026

Xinnian Guo, Haiquan Zhao, Chen Wang, Xiaoqiang Long, Yalin Liu, Wenjing Luo

Abstract:The augmented affine projection algorithm (AAPA) performs very well for highly colored input signals. However, its direct matrix inversion leads to high computational complexity, especially at high projection orders. Inspired by the favorable characteristics of set-membership filtering (SMF), this paper proposes the augmented set-membership affine projection algorithm (ASM-APA), which not only has low computational complexity but also offers improved performance compared with AAPA. The computational complexity and stability of ASM-APA are then analyzed, and the condition for maintaining the stability of the algorithm is provided. Finally, computer simulations demonstrate that ASM-APA has superior performance compared to AAPA.

Price of Fairness in Short-Term and Long-Term Algorithmic Selections

May 07, 2026

Shahin Jabbari, Chen Wang

Abstract:Algorithmic decision-making in high-stakes settings can have profound impacts on individuals and populations. While much prior work studies fairness in static settings, recent results show that enforcing static fairness constraints may exacerbate long-run disparities. Motivated by this tension, we study a stylized sequential selection problem in which a decision-maker repeatedly selects individuals, affecting both immediate utility and the population distribution over time. We introduce notions of group fairness for both the short and long term and theoretically analyze the trade-off between fairness and utility via the Price of Fairness (PoF). We characterize optimal and fair policies in the short term and show that the PoF can be large even when group distributions are nearly identical. In contrast, we show that long-term disparities can vanish under simple investment policies that achieve a low PoF. We also empirically validate these theoretical observations using both synthetic and real datasets.

* The short version of this paper appears in the proceedings of IJCAI-26

Shared Autonomy Assisted by Impedance-Driven Anisotropic Guidance Field

May 04, 2026

Sihan Chen, Hang Xu, Yupu Lu, Chen Wang, Benfang Duan, Ruixing Jia, Jia Pan

Abstract:Shared autonomy (SA) enables robots to infer human intent and assist in its achievement. While most research focuses on improving intent inference, it overlooks whether humans can understand the robot's intent in return. Without such mutual understanding, collaboration becomes less effective, degrading user experience and task performance. To address this gap, previous studies have explicitly conveyed the robot's intent through additional interfaces, which remain unintuitive and limited in expressiveness. Inspired by impedance control, we propose Impedance-Driven Anisotropic Guidance Field Enhanced Shared Autonomy (IAGF-SA), a novel paradigm that extends SA with an embodied, physically-grounded communication channel. This channel adaptively modulates the robot's dynamic response to human input, enabling intuitive, continuous, physically-grounded communication of the robot's intent while naturally guiding human actions. User studies across three scenarios and two teleoperation interfaces indicate that IAGF-SA improves task performance, human-robot agreement, and subjective experience, thus demonstrating its effectiveness in enhancing human-robot communication and collaboration.

* 8 pages, 7 figures. Accepted for publication in IEEE Robotics and Automation Letters
